-
Notifications
You must be signed in to change notification settings - Fork 15
Conversation
|
||
A client node MUST send all filter requests in a `FilterSubscribeRequest` message. | ||
This request MUST contain a `request_id`. | ||
The `request_id` MUST be a uniquely generated string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if an user sends two FilterSubscribeRequest
with the same request_id
? Would that generate an error? or would the latest request received overwrite the subscription parameters of the first one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest request received would overwrite the subscription parameters of the first be considered in addition to the previous one - if it SUBSCRIBE
s to a different set of content topics as the first, the client will now be subscribed to the content topics of both requests. We suggest now that service nodes only keep one subscription per peer (i.e. per peerId
) and simply service requests as they come in (without keeping any requestId
state - just making sure responses use the same requestId
).
Reusing a request_id
is now a spec violation on the side of the client, so the client would have to deal with the ambiguity of two responses on the same request_id
(what if one fail and one succeeds?). It could then use the refresh workflow to ensure the desired filter is installed (by just repeating its SubscribeRequest).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, qualifying here: the latest request received would be considered in addition to the parameters of the first. It's possible to build a subscription by sending multiple FilterSubscribeRequest
s variously adding and removing content_topic
s. This is an NB change - namely, SUBSCRIBE
no longer replaces the entire subscription, it simply subscribes a client to the included filter criteria in addition to what it's already subscribed to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be undefined behaviour as the client would not follow the spec in this case. It is a MUST for the request_id
to be unique.
Imo, it would be most consistent if the implementation simply dropped the second request with the same id.
The IETF says: "Be strict in what you send but tolerant in what you accept" (I'm paraphrazing, the original was better 😅 ), but here, accepting the second filter would contradict the spec.
We could add a line making this explicit:
When receiving a second request with the same request_id
, the request is silently dropped.
(oh, I wrote this in a not up-to-date browser window where your comments were not there yet @jm-clius 😅 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo, it would be most consistent if the implementation simply dropped the second request with the same id.
True, but we've now removed requestId
as some identifier for the subscription and we don't (want to) keep track of requestId
s in the service node. SUBSCRIBE
now simply means subscribing to the included filter criteria (not replacing/removing any existing subscription - just extending it). The requestId
, from the service node perspective, now simply ties the response to the request and is discarded directly afterwards by the service node. If it receives a subsequent SUBSCRIBE
with the same requestId
it will simply try to honor it with no reference to previous requestId
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❇️ LGTM ❇️ very nice update.
For future PRs:
As we are planning to introduce pubsub topic based sharing, and only exposing the content topic to applications,
it would make sense to remove the pubsub topic from filter once Waku v2 supports sharing.
This would make filter easier, removing fields in the request.
And push can directly push Waku messages.
(Match the RFC template, already mentioned in OP)
Add a implementation suggestions
section and make explicit that filter-push does not require a new transport connection, just a dial from the service node to the client, reusing the existing transport.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Keen to implement it and see how it goes.
content/docs/rfcs/12/README.md
Outdated
to allow filter clients to subscribe, modify, refresh and unsubscribe a desired set of filter criteria. | ||
The combination of different filter criteria for a specific filter client node is termed a "subscription". | ||
A filter client is interested in receiving messages matching the filter criteria in its registered subscriptions. | ||
It is RECOMMENDED that filter service nodes allow only one subscription per client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this recommendation may lead to issue:
- Alice subscribes for
/content/topic/1
to Bob - Alice does a handshake with Carol, and hence needs to subscribe to
/content/topic/2
to Bob
If only one subscription is to be maintain, what should Alice do?
A. unsubscribe and re-subscribe for both content topic?
B. update the subscription to add the new content topic?
A may lead to missed messages
B does not seem possible from the current spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see the confusion that the current phrasing may lead to. B is indeed possible and the recommended way of doing it.
Note that:
The combination of different filter criteria for a specific filter client node is termed a "subscription".
And this can be modified as you go along by subscribing and unsubscribing from content topics.
The idea is that a service node should not maintain multiple sets of such filter criteria for the same client. This used to be the case with the previous spec, where the same client could register the same filter criteria several times at the same service node, each time using a different request ID. I think I should remove this confusing wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this line in 29ae394 as it didn't add to the functionality and would just confuse readers. The point should be that any client can modify their filter criteria as many times as they want, adding and removing content topics on the fly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry my initial comment was not valid, I intended to delete it as I read the rest of the specs. I thin it's clear that Alice should update her subscription as I finished to read the spec.
A filter service node SHOULD push all messages | ||
matching the filter criteria in a registered subscription | ||
to the subscribed filter client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should a client worry about service nodes that push messages when not subscribed to?
This could be the case if for example a content or pubsub topic is RLN protected because processing of messages is costly.
To DDOS a client, a node could unsolicited messages, even tho the client never subscribed to this service node.
The only way a client can protect itself would be to track the peer id of nodes it subscribed too, and ensure push messages are coming from those nodes.
This may also be needed done the line for some scoring, to keep track of the reliability of a node.
I think we should mention this possible attack vector in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed! Very good point. Thanks. In the next PR I want to adhere to the RFC template, which includes a section on security, attack vectors, etc.
Adding @cammellos to conversation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a nice starting point. I have some comments:
- I miss a section, after terminology and before the adversarial model, that explains the different protocol workflows from a general point of view (e.g., subscription registration and message push, subscription refresh, subscription removal).
- Also, I noticed that the RFC covers mainly the happy path, and we should also describe at least the possible failure points (check the comments below for some examples). We can provide recommended strategies for those scenarios in the future.
Please, also take a look at the comments below.
content/docs/rfcs/12/README.md
Outdated
If the `pubsub_topic` is set and not empty, | ||
a `WakuMessage` will only match the filter criteria if it has a matching `content_topic` and was published on _the same_ `pubsub_topic`. | ||
If the `pubsub_topic` is either empty or unset, | ||
a `WakuMessage` will match the filter criteria if it has a matching `content_topic` and was published on _any_ `pubsub_topic`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should be more restrictive in the spec and make the pubsub_topic
mandatory. So, there should be always a pubsub_topic
and a list of content_topic
.
// Filter criteria
string pubsub_topic = 10;
repeated string content_topics = 11;
In a scenario where we apply sharding to the Waku relay networks, a node will be subscribed to "several" pub-sub topics. If the gateway node subscribes to a network with the same content topic as our network of interest, the user may receive mixed or duplicated messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree with you.
This spec was written when there was still ambiguity on whether a content topic strictly match to only one pubsub topic or could span several shards (in which case the mixed/duplicate messages would be desired).
I think we're luckily now moving in a direction where we can see a content topic as existing only within the context of a specific pubsub topic (and in future may even abstract the underlying pubsub topic away). In other words, we can see the (pubsubTopic, contentTopic)
vector as the minimal unique filter criterion. Pinging @kaiserd to double check direction here and will then update spec accordingly (needs changes in a couple of places).
|
||
A filter client that sends a `FilterSubscribeRequest` with `filter_subscribe_type` set to `SUBSCRIBER_PING` | ||
requests that the service node SHOULD indicate if it has any active subscriptions for this client. | ||
The filter client SHOULD exclude any filter criteria from the request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if a client sends the ping together with a subscription list? Should the node respond with a failure? Should it just ignore it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Have specified in cd3e880.
The filter client MUST include the desired filter criteria in the request. | ||
A client MAY use this request type to _modify_ an existing subscription | ||
by providing _additional_ filter criteria in a new request. | ||
A client MAY use this request type to _refresh_ an existing subscription |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that subscriptions have an expiration time in the server that should remove a subscription if the node has not refreshed it via SUBSCRIBE
within the time.
Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, within the filter service node the protocol does not mandate any management of subscriptions. It would be up to the client to ensure it maintains (redundant) subscriptions across filter service nodes. The implementation section (next) will recommend a maximum interval between ping for clients. Filter service nodes are specified so that they in general "keep as many subscriptions as they can, for as they long as their resources allow". The point being that a client should not expect its subscriptions to be reliable (we want to build a network where it is, of course, this is about specification).
A client MAY use this request type to _modify_ an existing subscription | ||
by providing _additional_ filter criteria in a new request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is a bit nitpicking, but:
A client MAY use this request type to _modify_ an existing subscription | |
by providing _additional_ filter criteria in a new request. |
I would model every "pubsub_topic+content_topic" as an individual subscription. So a node, identified by its PeerId/NodeId, can have multiple transactions. You either add a new subscription or remove an existing subscription. There is no updating of the subscription.
A subscription request with multiple content_topics
is a convenient way of batching multiple subscriptions in a single request without sending multiple requests, one per content_topic
.
IMO This simplifies the implementation a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! This is the plan in terms of how it will be implemented.
After trying to settle on terminology I came up with the earlier stated:
The combination of different filter criteria for a specific filter client node is termed a "subscription".
Each filter criterion/transaction will indeed be modelled as a "pubsub_topic+content_topic" which can only be added or removed. I found that people most often spoke about that action as "updating my subscription", but am not set on the terminology.
The filter service node SHOULD respond with a success code if it has any active subscriptions for this client | ||
or an error code if not. | ||
|
||
#### SUBSCRIBE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if a node is not subscribed to a network (a.k.a. pubsub_topic
)? Does the node receive an error? Which error code?
I think we should specify the "common" errors. I do not know where, but it is necessary so all the implementations use the same error codes. Something like IETF's RFC 9110: https://www.rfc-editor.org/rfc/rfc9110.html#name-status-codes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree re specifying the common errors. I think this should likely be in a separate BCP RFC with a separate lifecycle (can be draft
and extended while this one is stable
. This RFC aims to be as minimal as possible in order to define the smallest set of rules that allow for stable operation.
|
||
The following filter subscribe types are defined: | ||
|
||
#### SUBSCRIBER_PING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to finish seeing the point of this message 🤔
In my head: If a client sends a subscribe message, with the server's response, it will be sure that there is a subscription. And additionally, we are already sending a "subscription refresh" request from the client.
On the other hand, I see the value in updating and pruning the subscriptions if the subscriber does not respond to a server-sent ping. But this will overlap with the peer management and the connection's keep-alive mechanisms (e.g., libp2p's ping).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was also unsure how useful this would be, so wanted to get the most common clients' perspective to understand if this has any value. (See e.g. point 4 in the description)
it will be sure that there is a subscription
But the server provides no guarantees to maintain this subscription over time. The client's responsibility is to have some redundancy and a ping mechanism to ensure that its subscriptions remain active.
we are already sending a "subscription refresh" request from the client
Indeed why I wasn't sure if the lightweight ping is necessary, but note that refreshing a complex subscription with many content topics will be expensive, whereas this envisions a much more frequent ping mechanism. It could be a special case of subscription refresh with empty filter criteria, but IMO that would be abusing the mechanism and I'd rather have something explicit.
this will overlap with the peer management and the connection's keep-alive mechanisms
This functions on a different level - it's not about whether there is an active connection. For filter we assume that connections may drop from time to time. For whichever reason, the service node may remove your subscription (e.g. due to a restart, reaching capacity, etc.) and the client needs a lightweight way to ping its subscription - not strictly to keep it alive, but to maintain a minimum set of subscriptions across multiple service nodes.
content/docs/rfcs/12/README.md
Outdated
Since the filter protocol does not include caching or fault-tolerance, | ||
this is a best effort push service with no bundling or retransmission of messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should an implementation manage delivery failures? Should it drop the subscription? Should it retry to push every message?
I think that defining a push failure strategy is a complex scenario that the spec cannot leave open to the implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I've added this in 7c47eb0.
The idea is to specify this is in such a way that it does not preclude service nodes from attempting retransmissions, but does not build a dependency on this mechanism into the minimal protocol definition either.
In addition we should provide a BCP where we're opinionated on implementation. Good catch!
|
||
// Filter criteria | ||
optional string pubsub_topic = 10; | ||
repeated string content_topics = 11; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should write a recommended maximum number of content topics per subscription request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree and good point! Will add that to the implementation suggestions in a follow-up PR.
- _filter-subscribe_: `/vac/waku/filter-subscribe/2.0.0-beta1` | ||
- _filter-push_: `/vac/waku/filter-push/2.0.0-beta1` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't finish to see why we need to split the protocol ID into two different protocols 🤔 In my head:
The filter protocol acts as a bidirectional protocol with a control path and a data path (like libp2p's Gossipsub has):
- The control path: The gateway acts as a server for the "request-response" subscription management flow. The subscriber acts as the client requesting subscriptions.
- The data path: The gateway delivers (via "push") the messages matching the subscription subscribers.
Based on that, it makes sense to announce (via libp2p's identify protocol) the filter protocol ID. A subscriber's peer managers will use the protocol ID to list this node as capable of managing filter subscriptions and pushing the subscriptions.
But what is the point of having a separate "receiver" protocol ID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand you correctly, this is because the "data path" requires a connection/stream being negotiated from the filter service node to the client who requested the subscription. This requires the client to have some protocol mounted which can be negotiated for this stream (where it is acting like a "service"). If this is simply the same as the subscribe protocol ID, the subscriber's peer manager will incorrectly list other subscribers (not capable of managing subscriptions/pushing) as possible service nodes. Not sure if this is what you meant?
I agree re similarities to gossipsub, but note that this is not currently a true "bidirectional" protocol. It does negotiate streams/connections in two directions, but other than gossipsub the capabilities of each "side" is not the same. For gossipsub each node acts as both a server and a client, whereas filter still has a server-side and client-side which needs to be differentiated.
@@ -61,77 +63,150 @@ The following are not considered as part of the adversarial model: | |||
## Protobuf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the term "Wire format". It is more generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree! Note that this PR specifically avoids format/heading/layout changes so that only the suggested wire format change can be reviewed. Once done, next step would be to adhere to the template and add theory, implementation, etc. sections.
Excellent feedback, thanks @LNSD. FWIW, this spec still requires a rewrite to adhere to the template, so only focuses on the wire format.
Indeed! I'd add that to the TBD
Agree! Have already added some clarifications re possible failure points based on your comments and will do a further review to add more details.
Yes. Follow-ups will be adding a minimal implementation section to this RFC and then a more detailed BCP RFC that will describe more advanced strategies (including how to use filter in combination with Waku Peer Exchange to get a good, random subset of service nodes for reliability, combining with store as "backup" and using message uids to sync) |
#### UNSUBSCRIBE | ||
|
||
A filter client that sends a `FilterSubscribeRequest` with `filter_subscribe_type` set to `UNSUBSCRIBE` | ||
requests that the service node SHOULD _stop_ pushing messages matching this filter to the client. | ||
The filter client MUST include the filter criteria it desires to unsubscribe from in the request. | ||
A client MAY use this request type to _modify_ an existing subscription | ||
by providing _a subset of_ the original filter criteria to unsubscribe from in a new request. | ||
The filter service node SHOULD respond with a success code if it successfully honored this request | ||
or an error code if not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if I a client sends an UNSUBSCRIBE
request containing a pubsub_topic
and no content_topics
? (since a repeated
field can contain 0 items)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for picking this up! It should definitely be clarified in the specs. My view is that SUBSCRIBE
and UNSUBSCRIBE
(in fact the idea of subscriptions) should be related to content topics, qualified by a (perhaps now mandatory) pubsub topic. That means that only content topics will be considered in subscribe and unsubscribe requests. Not having any content topics will be seen as an empty unsubscribe request and SHOULD be ignored.
Of course, LMK if you believe from client perspective if a different semantic would work better, but the main idea of Filter is to operate on content topics.
#### SUBSCRIBE | ||
|
||
A filter client that sends a `FilterSubscribeRequest` with `filter_subscribe_type` set to `SUBSCRIBE` | ||
requests that the service node SHOULD push messages matching this filter to the client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a subscriber only specifies the pubsub_topic
and no content_topic
, would they receive all messages that match the pubsub topic regardless of the content type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment on the UNSUBSCRIBE
above. In short, pubsub topic is only seen as a qualifier for the content topic. In other words, having no content topics SHOULD be treated as a an empty SUBSCRIBE and SHOULD be ignored. This indeed needs to be clarified.
FilterSubscribeType filter_subscribe_type = 2; | ||
|
||
// Filter criteria | ||
optional string pubsub_topic = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean we can only have subscriptions for one pubsub topic for a given remote node?
I'd assume with topic sharding, a light client may need to subscribe to several pubsub topics (community topic + 1:1 topic at least) and if I understand correctly, any subscription request replaces the previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered elsewhere, but adding here for completeness:
No, a client can subscribe to content topics within as many pubsub topics as it likes. The limitation here is only that each request can only subscribe to content topics belonging to the same, single pubsub topic. A client can simply send a subsequent request with content topics belonging to another pubsub topic if it's interested in those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this NOT be optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is only optional within the context of protobuf field design: some types of FilterSubscribeRequests (e.g. SUBSCRIBER_PING, UNSUBSCRIBE_ALL) SHOULD NOT have a populated pubsub_topic. Within the context of the protocol itself this field is conditionally mandatory (which would require optional in the protobuf), e.g. the pubsub_topic MUST be populated, if the filter_subscribe_type is SUBSCRIBE or UNSUBSCRIBE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ❇️
(wrt draft RFC versioning)
This can be used as a reference until the process is covered in COSS :).
This PR addresses #469 and waku-org/pm#5
It proposes a change to the Filter protocol to:
Why two protocol IDs for a single protocol?
Note that this is a single protocol, but uses @richard-ramos's idea of using two different libp2p protocol-ids. This will enable simple capability discovery based on libp2p-identify and is needed for nodes (e.g. Status Desktop instances) to provide a filter service to resource restricted nodes.
What to review?
Please review the:
I have deliberately not changed the RFC structure to conform to the template, so that the proposed changes are more visible in the diff.
The protobuf will also be added to the protobuf repository and compiled for consistency checks.
These will be done in subsequent PRs. The idea in this PR is simply to reach agreement on protocol IDs, protobuf structure and overall functionality/scope.
Some remarks/questions
MessagePush
contains only a singleWakuMessage
. This is because service nodes should simply forward matching messages to the clients without any caching/bundling in this version of filter protocol.Store
protocol is the only caching protocol and can be used in combination withfilter
to provide cached messages matching specific filter criteria.I'm still ambivalent on whether thisUPDATE: @richard-ramos pointed out that we also need thewaku_message
needs to be nested in aMessagePush
at all or if we simply forward the rawWakuMessage
over the wire when pushing.pubsub_topic
for every Push, so the nesting is necessary.SUBSCRIBE
) to ensure subscription continuity